ABSTRACT

The modeling of cascade processes in multi-agent systems
in the form of complex networks has in recent years become
an important topic of study due to its many applications:
the adoption of commercial products, the spread of disease, the
diffusion of an idea, etc. In this paper, we begin by identifying
seven desiderata that a framework for modeling such
processes should satisfy: the ability to represent
attributes of both nodes and edges, an explicit representation
of time, the ability to represent non-Markovian
temporal relationships, representation of uncertain information,
the ability to represent competing cascades, allowance
of non-monotonic diffusion, and computational tractability.
We then present the MANCaLog language, a formalism based
on logic programming that satisfies all these desiderata, and
focus on algorithms for finding minimal models (from which
the outcome of cascades can be obtained) as well as how
this formalism can be applied in real-world scenarios. We
are not aware of any other formalism in the literature that
meets all of the above requirements.
1.  INTRODUCTION  AND  RELATED  WORK

An epidemic working through a population, cascading electrical
power failures, product adoption, and the spread of
a mutant gene are all examples of diffusion processes that
can happen in multi-agent systems structured as complex
networks. These network processes have been studied in a
variety of disciplines, including computer science [10], biology
[11], sociology [8], economics [12], and physics [15].
Much existing work in this area is based on pre-existing
models in sociology and economics - in particular the work
of [8, 12]. However, recent examinations of social networks
- both analyses of large data sets and experiments - have
indicated that there may be additional factors to consider
that are not taken into account by these models. These
include the attributes of nodes and edges, competing diffusion
processes, and time. In this paper, we outline seven
design criteria (Section 1.1) for such a framework and introduce
MANCaLog (Section 2), which is to the best of our
knowledge the first logical language for modeling diffusion
in complex networks that meets these criteria. MANCaLog
is a rule-based framework (inspired by logic programming)
that can richly express how agents adopt or fail to adopt
certain behaviors, and how these behaviors cascade through
a network. In Section 3 we also introduce fixed-point based
algorithms that compute the result of the diffusion
process. These algorithms are proven
not only to be correct, but also to run in polynomial time.
Hence, our approach can not only better express many aspects
of cascades in complex networks, but it can do so in
a reasonable amount of time. We conclude by discussing
applications of MANCaLog in Section 4.

Proofs  of  all  results  stated  in  this  paper  can  be  found  in  the 
appendix. 

1.1  Desiderata  of  Properties 

We  begin  by  identifying  a  set  of  criteria  that  we  believe  a 
framework  for  reasoning  about  cascades  in  complex  networks 
should  satisfy. 

1. Multiply labeled and weighted nodes and edges.

Many existing frameworks for studying diffusion in complex
networks assume that there is only one type of vertex that
may become “active” [10] or may “mutate” [11, 15] and only
one possible relationship between nodes. In reality, nodes
and edges often have different properties. For instance, labels
on edges can be used to differentiate between strong
and weak ties (edge types) - a concept that is well studied
[7]. Recently, such attributes of nodes have been shown
to impact influence in a network [1].

2. Explicit Representation of Time. Most work in
the literature assumes static models, with the exception
of the recent developments in [4, 5, 6], which assume the
existence of a timestamped log referring to actions taken
in the network in order to learn how nodes influence each
other. Though [4] tackles the problem of predicting the
time at which a certain node will take an action, the authors
make several simplifying assumptions such as monotonicity
of probability functions, probabilistic independence,
sub-modularity and, most importantly for this criterion, a
modeling of time solely based on temporal decay of influence.
We seek a richer model of temporal relationships between
conditions in the network structure, the current state
of the cascades in process, and how influence propagates.

3. Non-Markovian Temporal Relationships. Apart
from time being explicitly represented, the temporal dependencies
should be able to span multiple units of time. Hence,
the “memoryless” mode of a standard Markov process, where
only the information of the current state is required, is insufficient.
Here, we strive to create a framework where dependencies
can be from other earlier time steps. This issue has
been previously studied with respect to more general logic
programming frameworks such as [13], but to our knowledge
has not been applied to social networks.

4.  Representation  of  Uncertainty.  As  in  practice  it  is 
not  always  possible  to  judge  the  attributes  of  all  individuals 
in  a  network,  an  element  of  uncertainty  must  be  included. 
However,  in  connection  with  point  7,  this  should  not  be  at 
the  expense  of  tractability.  For  instance,  the  probabilistic 
models  of  [10]  are  normally  addressed  with  simulation  (and 
hence  do  not  scale  well)  as  the  computation  of  the  expected 
number of activated nodes is a #P-hard problem [3].

5.  Competing  Cascades.  Often,  in  real-world  situations 
there  will  be  competing  cascading  processes.  For  example, 
in  evolutionary  graph  theory  [11],  “mutants”  and  “residents” 
compete  for  nodes  in  the  network  -  the  success  of  one  hinges 
on  the  failure  of  the  other. 

6. Non-Monotonic Cascades. In much existing work on
cascades in complex networks, the number of nodes attaining
a certain property at each time step can only increase.
However, if we allow for competing cascades in the same
model, we cannot have such a strong restriction as the success
of one cascade may come at the expense of another.

7. Tractability. The social networks of interest in today’s
data mining problems often have millions of nodes. It is
reasonable to expect that soon billion-node networks will be
commonplace. Any framework for dealing with these problems
must be solvable in a reasonable amount of time and
offer areas for practical improvement for further scalability.

1.2  Related  Work 

The above criteria can be summarized as the desire to
design the most expressive language for network cascades
possible while still allowing computation of the outcome of
a diffusion process to be completed in a tractable amount
of time. As a comparison, let us briefly describe some relevant
related work. Perhaps the best known general model
for representing diffusion in complex networks is the
independent cascade/linear threshold (IC/LT) model of [10].
However, although this framework was shown to be capable
of expressing a wide variety of sociological models, it
assumes the Markov property and does not allow for the
representation of multiple attributes on vertices and edges.
A more recent framework, social network optimization problems
(SNOPs) [14], uses logic programming to allow for the
representation of attributes, but this framework does not allow
for competing processes or non-monotonic cascades. A
related logic programming framework, competitive diffusion
(CD) [2], allows for competitive diffusion and non-monotonic
processes but does not explicitly represent time and also
makes Markovian assumptions. Further, we also note that
the semantics of CD yields a “most probable interpretation”
that is not a unique solution. Hence, a given model in
that framework can lead to multiple, and possibly contradictory,
outcomes to a cascade (this problem is avoided in
MANCaLog). Another popular class of models is Evolutionary
Graph Theory (EGT) [11], which is highly related to
the voter model (VM) [15]. Although this framework allows
for competing processes and non-monotonic diffusion,
it also makes Markovian assumptions while not explicitly
representing time. Further, determining the outcome of a
cascade in those models is NP-hard, while determining the
outcome in MANCaLog can be accomplished in polynomial
time. Table 1 lists how these models compare to MANCaLog
when considering our design criteria.

Figure 1: Simple online social network Gsoc. Solid
edges are labeled with strTie, while dashed edges are
labeled with wkTie. White nodes are labeled with
male, while gray nodes are labeled with fem. Arrows
represent the direction of the edge; double-headed
edges represent two edges with the same label.

2.  FRAMEWORK 
2.1  Syntax  and  Semantics 

In  this  paper  we  assume  that  agents  are  arranged  in  a 
directed  graph  (or  network)  G  =  (V,E),  where  the  set  of 
nodes  corresponds  to  the  agents,  and  the  edges  model  the 
relationships  between  them.  We  also  assume  a  set  of  labels 
C,  which  is  partitioned  into  two  sets:  fluent  labels  Cf  (labels 
that  can  change  over  time)  and  non-fluent  labels  Cnf  (labels 
that  do  not);  labels  can  be  applied  to  both  the  nodes  and 
edges of the network. We will use the notation Q = V ∪ E
to be the set of all components (nodes and edges) in the
network. Thus, c ∈ Q could be either a node or an edge.

Example 2.1. We will use the sample online social network
Gsoc shown in Figure 1 as the running example; Qsoc is
used to denote the set of components of Gsoc. Here we have
Cnf = {male, fem, strTie, wkTie} representing male, female,
strong ties and weak ties, respectively. Additionally, we have
Cf = {visPgA, visPgB} representing visiting webpage A and
visiting webpage B, respectively. ■

In this paper, we present a logical language where we use
atoms, referring to labels and weights, to describe properties
of the nodes and edges. Though labels themselves could
be modeled as atoms instead of predicates (to model non-ground
labelings that allow for greater expressibility), for
simplicity of presentation we leave this to future work. The
first piece of the syntax is the network atom.


Table 1: Comparison with other models

Criterion                                  MANCaLog  IC/LT [10]  SNOP [14]  CD [2]  EGT/VM [11]
1. Labels                                  Yes       No          Yes        Yes     No
2. Explicit Representation of Time         Yes       No          Yes        No      Yes
3. Non-Markovian Temporal Relationships    Yes       No          No         No      No
4. Uncertainty                             Yes       Yes         Yes        Yes     Yes
5. Competing Cascades                      Yes       No          No         Yes     Yes
6. Non-monotonic Cascades                  Yes       No          No         Yes     Yes
7. Tractability                            PTIME     #P-hard     PTIME      PTIME   NP-hard

Definition 2.1 (Network Atom). Given label L ∈ C
and weight interval bnd ⊆ [0, 1], then (L, bnd) is a network
atom. A network atom is fluent (resp., non-fluent) if
L ∈ Cf (resp., L ∈ Cnf). We use NA to denote the set of
all possible network atoms.

Network atoms describe properties of nodes and edges.
The definition is intuitive: L represents a property of the
vertex or edge, and associated with this property is some
weight. Because the weight may be uncertain, it is represented
as an interval bnd, which can be open or closed. An
invalid bound is represented by ∅, which is equivalent to all
other invalid bounds.

Definition 2.2 (World). A world W is a set of network
atoms such that for each L ∈ C there is no more than
one network atom of the form (L, bnd) in W.

A network formula over NA is defined using conjunction,
disjunction, and negation in the usual way. If a formula
contains only non-fluent (resp., fluent) atoms, it is a
non-fluent (resp., fluent) formula.

Definition 2.3 (Satisfaction of Worlds). Given a
world W and network formula f, satisfaction of f by W
(denoted W ⊨ f) is defined:

•  If f = (L, [0, 1]) then W ⊨ f.

•  If f = (L, ∅) then W ⊭ f.

•  If f = (L, bnd), with bnd ≠ ∅ and bnd ≠ [0, 1], then
W ⊨ f iff there exists (L, bnd′) ∈ W s.t. bnd′ ⊆ bnd.

•  If f = ¬f′ then W ⊨ f iff W ⊭ f′.

•  If f = f1 ∧ f2 then W ⊨ f iff W ⊨ f1 and W ⊨ f2.

•  If f = f1 ∨ f2 then W ⊨ f iff W ⊨ f1 or W ⊨ f2.

For some arbitrary label L ∈ C, we will use the notation
Tr = (L, [0, 1]) and F = (L, ∅) to represent a tautology
and contradiction, respectively. For ease of notation (and
without loss of generality), we say that if there does not
exist some bnd s.t. (L, bnd) ∈ W, then this implies that
(L, [0, 1]) ∈ W.

Example 2.2. Following from Example 2.1, the network
atom (fem, [1, 1]) can be used to identify a node as a
woman. Likewise, the world W =

{(fem, [1, 1]), (male, [0, 0]), (visPgA, [1, 1]), (visPgB, [0, 0])}

might be used to identify a woman who visits webpage A.
Clearly, we have that W ⊨

(fem, [1, 1]) ∧ ¬(visPgA, [0.5, 0.9]) ∧ ¬(visPgB, [0.1, 0.7])

Note that the network atoms formed with strTie and wkTie
are not present; this could be due to the fact that such a
world is used to describe a node and not an edge, and hence
there is no information about those two labels. As a result,
W ⊨ (strTie, [0, 1]) ∧ (wkTie, [0, 1]). ■
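
As an illustration only, Definition 2.3 can be sketched in a few lines of Python. The class names (Atom, Not, And, Or), the function satisfies, and the use of None for the invalid bound are our own choices for this sketch and are not part of MANCaLog; for brevity the sketch also assumes that every bound is a closed interval.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union

Interval = Tuple[float, float]   # closed interval [lo, hi] inside [0, 1] (assumption)
FULL: Interval = (0.0, 1.0)      # the uninformative bound [0, 1]


@dataclass(frozen=True)
class Atom:
    label: str
    bnd: Optional[Interval]      # None encodes the invalid bound


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


Formula = Union[Atom, Not, And, Or]
World = Dict[str, Interval]      # at most one bound per label


def satisfies(world: World, f: Formula) -> bool:
    """Return True iff the world satisfies formula f (Definition 2.3)."""
    if isinstance(f, Atom):
        if f.bnd is None:                    # contradiction: never satisfied
            return False
        lo, hi = world.get(f.label, FULL)    # an absent label means [0, 1]
        return f.bnd[0] <= lo and hi <= f.bnd[1]
    if isinstance(f, Not):
        return not satisfies(world, f.sub)
    if isinstance(f, And):
        return satisfies(world, f.left) and satisfies(world, f.right)
    if isinstance(f, Or):
        return satisfies(world, f.left) or satisfies(world, f.right)
    raise TypeError(f"unsupported formula: {f!r}")


# World W and the formula from Example 2.2.
W: World = {"fem": (1.0, 1.0), "male": (0.0, 0.0),
            "visPgA": (1.0, 1.0), "visPgB": (0.0, 0.0)}
phi = And(Atom("fem", (1.0, 1.0)),
          And(Not(Atom("visPgA", (0.5, 0.9))),
              Not(Atom("visPgB", (0.1, 0.7)))))
```

Calling satisfies(W, phi) checks the claim of Example 2.2; the lookup with a default of [0, 1] implements the convention that an absent label carries no information.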

The  idea  is  to  use  MANCaLog  to  describe  how  properties 
(specified  by  labels)  of  the  nodes  in  the  network  change  over 
time.  We  assume  that  there  is  some  natural  number  tmax 
that  specifies  the  total  amount  of  time  we  are  considering, 
and we use τ = {t | t ∈ [0, tmax]} to denote the set of all
time  points.  How  well  a  certain  property  can  be  attributed 
to  a  node  is  based  on  a  weight  (to  which  the  bnd  bound  in 
the  network  atom  refers).  As  time  progresses,  a  weight  can 
either  increase/decrease  and/or  become  more/less  certain. 
We  now  introduce  the  MANCaLog  fact,  which  states  that 
some  network  atom  is  true  for  a  node  or  edge  during  certain 
times. 

Definition 2.4 (MANCaLog Fact). If [t1, t2] ⊆ [0, tmax],
c ∈ Q, and a ∈ NA, then (a, c) : [t1, t2] is a MANCaLog
fact. A fact is fluent (resp., non-fluent) if atom a is fluent
(resp., non-fluent). All non-fluent facts must be of the form
(a, c) : [0, tmax]. Let ℱ be the set of all facts and ℱ_nf, ℱ_f be
the sets of all non-fluent and fluent facts, respectively.

Example 2.3. Following from Example 2.2, the following
facts are based on Figure 1:

F1 = ((male, [1, 1]), 1) : [0, tmax]
F2 = ((fem, [1, 1]), 2) : [0, tmax]
F3 = ((male, [1, 1]), 3) : [0, tmax]
F4 = ((strTie, [1, 1]), (1, 2)) : [0, tmax]
F5 = ((strTie, [1, 1]), (2, 1)) : [0, tmax]
F6 = ((wkTie, [1, 1]), (2, 3)) : [0, tmax]
F7 = ((visPgA, [0.8, 1.0]), 1) : [0, tmax]
F8 = ((visPgA, [0.5, 1.0]), 2) : [0, tmax]

For instance, agent 1 is male, and has a strong tie to agent 2,
who is female. ■

Next,  we  introduce  integrity  constraints  (ICs). 

Definition 2.5. Given fluent network atom a and conjunction
of network atoms b, an integrity constraint is of the
form a ↩ b.

Intuitively, integrity constraint (L, bnd) ↩ b means that
if at a certain time point a component (vertex or edge) of
the network has a set of properties specified by conjunction
b, then at that same time the component’s weight for label
L must be in interval bnd.

Example 2.4. Following from the previous examples, the
integrity constraint (male, [0, 0]) ↩ (fem, [1, 1]) would require
any node designated as a female to not be male. ■

We  now  define  MANCaLog  rules.  The  idea  behind  rules  is 
simple:  an  agent  that  meets  some  criteria  is  influenced  by 
the  set  of  its  neighbors  who  possess  certain  properties.  The 
amount  of  influence  exerted  on  an  agent  by  its  neighbors  is 
specified  by  an  influence  function,  whose  precise  effects  will 
be  described  later  on  when  we  discuss  the  semantics.  As  a 
result,  a  rule  consists  of  four  major  parts:  (i)  an  influence 
function,  (ii)  neighbor  criteria,  (iii)  target  criteria,  and  (iv)  a 
target.  Intuitively,  (i)  specifies  how  the  neighbors  influence 
the  agent  in  question,  (ii)  specifies  which  of  the  neighbors 
can  influence  the  agent,  (iii)  specifies  the  criteria  that  cause 
the  agent  to  be  influenced,  and  (iv)  is  the  property  of  the 
agent  that  changes  as  a  result  of  the  influence. 

We  will  discuss  each  of  these  parts  in  turn,  and  then  define 
rules  in  terms  of  these  elements.  First,  we  define  influence 
functions  and  neighbor  criteria. 

Definition 2.6 (Influence Function). An influence
function is a function ifl : ℕ × ℕ → [0, 1] × [0, 1] that satisfies
the following two axioms:

1.  ifl can be computed in constant (O(1)) time.

2.  For x′ > x we have ifl(x′, y) ⊆ ifl(x, y).

We use IFL to denote the set of all influence functions.

Intuitively, an influence function takes the number of qualifying
influencers and the number of eligible influencers and
returns a bound on the new value for the weight of the property
of the target node that changes. In practice, we expect
the time complexity of such a function to be a polynomial
in terms of the two arguments. However, as both arguments
are naturals bounded by the maximum degree of a node in
the network, this value will be much smaller than the size
of the network - we thus treat it as a constant here.


Example 2.5. The well-known “tipping model” originally
introduced in [8, 12] states that an agent adopts a behavior
when a certain fraction of his incoming neighbors do so. A
common tipping function is the majority threshold where
at least half of the agent’s neighbors must previously adopt
the behavior. We can represent this using the following influence
function:

tip(x, y) = [1.0, 1.0]  if x/y ≥ 0.5
            [0.0, 1.0]  otherwise

This function says that an agent adopts a certain behavior
if at least half of his incoming neighbors have some property
(strong ties, weak ties, meet some requirement of gender,
income, etc.) and that we have no information otherwise.
In our framework, we can leverage the bounds associated with
the influence function to create a “soft” tipping function:

sftTp(x, y) = [0.7, 1.0]  if x/y ≥ 0.5
              [0.0, 1.0]  otherwise

Intuitively, the above function says that an agent adopts a
behavior with a weight of at least 0.7 if at least half of the
incoming neighbors that have some attribute meet some criteria,
and that we have no information otherwise. Another possibility
is to have an influence function that may reduce the weight
with which an agent adopts a certain behavior:

ngTp(x, y) = [0.0, 0.2]  if x = y
             [0.0, 1.0]  otherwise

The ngTp function says that an agent will adopt a behavior
with a weight no greater than 0.2 if all of the incoming
neighbors possessing some property meet some criteria, and
that we have no information otherwise. ■

Definition 2.7 (Neighbor Criterion). If gedge, gnode
are non-fluent network formulas (formed over edges and nodes,
respectively), h is a conjunction of network atoms, and ifl is
an influence function, then (gedge, gnode, h)_ifl is a neighbor
criterion.

Formulas gnode and h in a neighbor criterion specify the
(non-fluent and fluent, respectively) criteria on a given neighbor,
while formula gedge specifies the non-fluent criteria on
the directed edge from that neighbor to the node in question.

The next component is the “target criteria”, which are the
criteria that an agent must satisfy in order to be influenced
by its neighbors. Ideas such as “susceptibility” [1] can be
integrated into our framework via this component. We represent
these criteria with a formula of non-fluent network
atoms. The final component, the “target”, is simply the label
of the target agent that is influenced by its neighbors.
Hence, we now have all the pieces to define a rule.

Definition 2.8 (Rule). Given fluent label L, natural
number Δt, target criteria f and neighbor criteria
(gedge, gnode, h)_ifl, a MANCaLog Rule is of the form:

r = L ←[Δt] f, (gedge, gnode, h)_ifl

where the annotation [Δt] on the arrow is the time offset of the
rule. We will use the notation head(r) to denote L.

Note that the target (also referred to as the head) of the rule
is a single label; essentially, the body of the rule characterizes
a set of nodes, and this label is the one that is modified for
each node in this set. More specifically, the rule is essentially
saying that when certain conditions for an agent and its
neighbors are met, the bnd bound for the network atom
formed with label L on that agent changes. Later, in the
semantics, we introduce network interpretations, which map
components (nodes and edges) of the network to worlds at
a given point in time. The rule dictates how this mapping
changes in the next time step.
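
The three influence functions of Example 2.5 can be written down directly. The following Python sketch is illustrative only: the snake-case names, the NO_INFO constant and the helper respects_axiom_2 are ours, and two details are assumptions rather than part of the definitions, namely that y = 0 (no eligible neighbors) yields no information in tip and sftTp, and that Axiom 2 of Definition 2.6 only needs to be checked for x ≤ y, since the qualifying neighbors are drawn from the eligible ones.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]          # closed interval [lo, hi]
NO_INFO: Interval = (0.0, 1.0)          # the uninformative bound [0, 1]


def tip(x: int, y: int) -> Interval:
    """Majority threshold: full adoption once at least half qualify."""
    if y > 0 and x / y >= 0.5:          # y == 0 handled as "no information" (assumption)
        return (1.0, 1.0)
    return NO_INFO


def sft_tp(x: int, y: int) -> Interval:
    """Soft tipping: weight of at least 0.7 once at least half qualify."""
    if y > 0 and x / y >= 0.5:
        return (0.7, 1.0)
    return NO_INFO


def ng_tp(x: int, y: int) -> Interval:
    """Negative tipping: weight of at most 0.2 when all eligible neighbors qualify."""
    if x == y:
        return (0.0, 0.2)
    return NO_INFO


def subset(inner: Interval, outer: Interval) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def respects_axiom_2(ifl: Callable[[int, int], Interval], max_degree: int) -> bool:
    """Brute-force check of ifl(x', y) ⊆ ifl(x, y) for all x < x' <= y <= max_degree."""
    for y in range(max_degree + 1):
        for x in range(y + 1):
            for x2 in range(x + 1, y + 1):
                if not subset(ifl(x2, y), ifl(x, y)):
                    return False
    return True
```

Because both arguments are bounded by the maximum degree of the network, an exhaustive check such as respects_axiom_2 is cheap for small degrees and can serve as a sanity test when a new influence function is introduced.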

Definition 2.9 (MANCaLog Program). A program P
is a set of rules, facts, and integrity constraints s.t. each
non-fluent fact F ∈ ℱ_nf appears no more than once in the
program. Let 𝒫 be the set of all programs.

Example 2.6. Following from the previous examples, we
can have a MANCaLog program that leverages the sftTp and
ngTp influence functions in rules that are more expressive
than previous models. Consider the following rules, where each
arrow is annotated with the time offset Δt of the rule:

R1 = visPgA ←[2] (fem, [1, 1]), ((strTie, [0.9, 1]), Tr, (visPgA, [0.9, 1.0]))_sftTp

R2 = visPgB ←[1] (male, [1, 1]), (Tr, Tr, (visPgB, [0.8, 1.0]))_sftTp

R3 = visPgA ←[3] (male, [1, 1]), (Tr, (fem, [1, 1]), ¬(visPgA, [0.7, 1.0]))_ngTp

Rule R1 says that a female agent in the network visits page
A with a weight of at least 0.7 (this is specified in the sftTp
influence function) if at least half of her strong ties (with
weight of at least 0.9) visited the page (with a weight of at
least 0.9) two days ago. The rest of the rules can be read
analogously. ■

Table 2: Example network interpretation, NI1.

Comp.   male    fem     strTie  wkTie   visPgA      visPgB
1       [1,1]   [0,0]   -       -       [0.9, 1.0]  [0.8, 1.0]
2       [0,0]   [1,1]   -       -       [0.0, 0.3]  [0.0, 0.2]
3       [1,1]   [0,0]   -       -       [0.6, 1.0]  [0.0, 0.2]
4       [0,0]   [1,1]   -       -       [0.0, 0.2]  [0.9, 1.0]
5       [1,1]   [0,0]   -       -       [0.0, 0.2]  [0.7, 1.0]
(1,2)   -       -       [1,1]   [0,0]   -           -
(2,1)   -       -       [1,1]   [0,0]   -           -
(1,3)   -       -       [0,0]   [1,1]   -           -
(2,3)   -       -       [0,0]   [1,1]   -           -
(3,4)   -       -       [1,1]   [0,0]   -           -
(4,3)   -       -       [1,1]   [0,0]   -           -
(4,5)   -       -       [1,1]   [0,0]   -           -

We now introduce our first semantic structure: the network
interpretation.

Definition 2.10 (Network Interpretation). A network
interpretation is a mapping of network components to
sets of network atoms, NI : Q → 2^NA. We will use 𝐍𝐈 to
denote the set of all network interpretations.

We  note  that  not  all  labels  will  necessarily  apply  to  all 
nodes  and  edges  in  the  network.  For  instance,  certain  labels 
may  describe  a  relationship  while  others  may  only  describe 
a  property  of  an  individual  in  the  network.  If  a  given  label  L 
does  not  describe  a  certain  component  c  of  the  network,  then 
in a valid network interpretation NI, (L, [0, 1]) ∈ NI(c).

Example 2.7. Consider G′soc, the induced subgraph of Gsoc
that has only nodes {1, 2, 3, 4, 5}. Table 2 shows the contents
of NI1, an example network interpretation. ■

We  define  a  MANCaLog  interpretation  as  follows. 

Definition 2.11 (Interpretation). A MANCaLog interpretation
I is a mapping of natural numbers in the interval
[0, tmax] to network interpretations, i.e., I : ℕ → 𝐍𝐈.
Let ℐ be the set of all possible interpretations.

2.2  Satisfaction 

First,  we  define  what  it  means  for  an  interpretation  to 
satisfy  a  fact  and  a  rule. 

Definition 2.12 (Fact Satisfaction). An interpretation
I satisfies MANCaLog fact (a, c) : [t1, t2], written I ⊨
(a, c) : [t1, t2], iff ∀t ∈ [t1, t2], I(t)(c) ⊨ a.


Example 2.8. Consider interpretation I1, where I1(0) =
NI1 (from Example 2.7), and MANCaLog facts F7 and F8
from Example 2.3. In this case, I1 ⊨ F7 and I1 ⊭ F8. ■

For non-fluent facts, we introduce the notion of strict satisfaction,
which enforces the bound in the interpretation to
be set to exactly what the fact dictates.

Definition 2.13 (Strict Fact Satisfaction). Interpretation
I strictly satisfies MANCaLog fact (a, c) : [t1, t2]
iff ∀t ∈ [t1, t2], a ∈ I(t)(c).

Next,  we  define  what  it  means  for  an  interpretation  to 
satisfy  an  integrity  constraint. 

Definition 2.14 (IC Satisfaction). An interpretation
I satisfies integrity constraint a ↩ b iff for all t ∈ τ and
c ∈ Q, I(t)(c) ⊨ ¬b ∨ a.
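
Definitions 2.12 to 2.14 translate almost literally into code. The Python sketch below is a hedged illustration, not the authors' implementation: it assumes an interpretation is stored as a nested dictionary (time point, then component, then label), that all bounds are closed intervals, and that components missing from the dictionary carry no information; all function names are ours.

```python
from typing import Dict, Hashable, List, Tuple

Interval = Tuple[float, float]                      # closed interval (assumption)
Atom = Tuple[str, Interval]                         # (label, bnd)
World = Dict[str, Interval]
Interpretation = Dict[int, Dict[Hashable, World]]   # time -> component -> world
FULL: Interval = (0.0, 1.0)


def holds(world: World, atom: Atom) -> bool:
    """World satisfies the atom: its bound for the label lies inside the atom's bound."""
    label, (lo, hi) = atom
    w_lo, w_hi = world.get(label, FULL)
    return lo <= w_lo and w_hi <= hi


def satisfies_fact(interp: Interpretation, atom: Atom, comp: Hashable,
                   t1: int, t2: int) -> bool:
    """Definition 2.12: the atom holds for comp at every t in [t1, t2]."""
    return all(holds(interp[t].get(comp, {}), atom) for t in range(t1, t2 + 1))


def strictly_satisfies_fact(interp: Interpretation, atom: Atom, comp: Hashable,
                            t1: int, t2: int) -> bool:
    """Definition 2.13: the stored bound equals the fact's bound exactly."""
    label, bnd = atom
    return all(interp[t].get(comp, {}).get(label, FULL) == bnd
               for t in range(t1, t2 + 1))


def satisfies_ic(interp: Interpretation, head: Atom, body: List[Atom]) -> bool:
    """Definition 2.14: wherever the whole body holds, the head must hold too."""
    for network in interp.values():
        for world in network.values():
            if all(holds(world, b) for b in body) and not holds(world, head):
                return False
    return True


# Nodes 1 and 2 of NI1 (Table 2) at time 0, with closed bounds assumed.
I1: Interpretation = {0: {
    1: {"male": (1.0, 1.0), "fem": (0.0, 0.0),
        "visPgA": (0.9, 1.0), "visPgB": (0.8, 1.0)},
    2: {"male": (0.0, 0.0), "fem": (1.0, 1.0),
        "visPgA": (0.0, 0.3), "visPgB": (0.0, 0.2)},
}}
```

Restricted to the single time point 0, this reproduces Example 2.8: the atom of F7 holds at node 1, while the atom of F8 does not hold at node 2.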

Before  we  define  what  it  means  for  an  interpretation  to 
satisfy  a  rule,  we  require  two  auxiliary  definitions  that  are 
used  to  define  the  bound  enforced  on  a  label  by  a  given  rule, 
and  the  set  of  time  points  that  are  affected  by  a  rule. 

Definition 2.15 (Bound function). For a given rule
r = L ←[Δt] f, (gedge, gnode, h)_ifl, node v, and network
interpretation NI,

Bound(r, v, NI) = ifl(|Qual(v, gedge, gnode, h, NI)|, |Elig(v, gedge, gnode, NI)|),

where

Elig(v, gedge, gnode, NI) =
{v′ ∈ V | NI(v′) ⊨ gnode ∧ (v′, v) ∈ E ∧ NI((v′, v)) ⊨ gedge}

and

Qual(v, gedge, gnode, h, NI) =
{v′ ∈ Elig(v, gedge, gnode, NI) | NI(v′) ⊨ h}.

Intuitively, the bound returned by the function depends on
the influence function and the number of qualifying and eligible
nodes that influence it.
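
To make Definition 2.15 concrete, the following Python sketch computes Elig, Qual, and Bound on a fragment of Table 2. It is a simplified illustration under stated assumptions: gedge, gnode, and h are all restricted to conjunctions of atoms (an empty list plays the role of Tr), bounds are closed intervals, the network interpretation is a dictionary keyed by nodes and by (source, target) edge tuples, and every function and variable name is ours.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Set, Tuple

Interval = Tuple[float, float]
Atom = Tuple[str, Interval]                 # (label, bnd)
World = Dict[str, Interval]
NetInterp = Dict[Hashable, World]           # nodes and (u, v) edges -> worlds
Edge = Tuple[int, int]
FULL: Interval = (0.0, 1.0)


def holds_all(world: World, atoms: List[Atom]) -> bool:
    """World satisfies a conjunction of atoms; the empty conjunction is Tr."""
    for label, (lo, hi) in atoms:
        w_lo, w_hi = world.get(label, FULL)
        if not (lo <= w_lo and w_hi <= hi):
            return False
    return True


def elig(v: int, edges: Iterable[Edge], g_edge: List[Atom],
         g_node: List[Atom], ni: NetInterp) -> Set[int]:
    """Incoming neighbors of v that meet the non-fluent node and edge criteria."""
    return {u for (u, w) in edges
            if w == v
            and holds_all(ni.get(u, {}), g_node)
            and holds_all(ni.get((u, w), {}), g_edge)}


def qual(v: int, edges: Iterable[Edge], g_edge: List[Atom],
         g_node: List[Atom], h: List[Atom], ni: NetInterp) -> Set[int]:
    """Eligible neighbors that additionally satisfy h."""
    return {u for u in elig(v, edges, g_edge, g_node, ni)
            if holds_all(ni.get(u, {}), h)}


def bound(ifl: Callable[[int, int], Interval], v: int, edges: Iterable[Edge],
          g_edge: List[Atom], g_node: List[Atom], h: List[Atom],
          ni: NetInterp) -> Interval:
    """Bound(r, v, NI) = ifl(|Qual|, |Elig|)."""
    return ifl(len(qual(v, edges, g_edge, g_node, h, ni)),
               len(elig(v, edges, g_edge, g_node, ni)))


def sft_tp(x: int, y: int) -> Interval:
    """Soft tipping function of Example 2.5 (y == 0 treated as no information)."""
    return (0.7, 1.0) if y > 0 and x / y >= 0.5 else FULL


# Fragment of NI1 (Table 2), closed bounds assumed; only labels used below.
EDGES = [(1, 2), (2, 1), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5)]
NI1: NetInterp = {
    1: {"visPgA": (0.9, 1.0), "visPgB": (0.8, 1.0)},
    2: {"visPgA": (0.0, 0.3), "visPgB": (0.0, 0.2)},
    3: {"visPgA": (0.6, 1.0), "visPgB": (0.0, 0.2)},
    4: {"visPgA": (0.0, 0.2), "visPgB": (0.9, 1.0)},
    5: {"visPgA": (0.0, 0.2), "visPgB": (0.7, 1.0)},
    (1, 2): {"strTie": (1.0, 1.0), "wkTie": (0.0, 0.0)},
}
H_R2 = [("visPgB", (0.8, 1.0))]             # neighbor condition h of rule R2
```

For node 3 and the neighbor criterion of rule R2, the eligible neighbors are nodes 1, 2, and 4, of which nodes 1 and 4 qualify, so sftTp is evaluated at (2, 3).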

Definition 2.16 (Target Time Set). For interpretation
I, node v, and rule r = L ←[Δt] f, (gedge, gnode, h)_ifl, the
target time set of I, v, r is defined as follows:

TTS(I, v, r) = {t ∈ [0, tmax] | I(t − Δt)(v) ⊨ f}

We also extend this definition to a program P, for a given
c ∈ Q and L ∈ C, as follows: TTS(I, c, L, P) =

⋃_{r ∈ P, head(r) = L} TTS(I, c, r)
∪ {t ∈ [t1, t2] | ((L, bnd), c) : [t1, t2] ∈ P}
∪ {t | (L, bnd) ↩ b ∈ P ∧ I(t)(c) ⊨ b}

We  can  now  define  satisfaction  of  a  rule  by  an  interpretation. 

Definition 2.17. An interpretation I satisfies a rule r =
L ←[Δt] f, (gedge, gnode, h)_ifl iff for all v ∈ V and t ∈ TTS(I, v, r)
it holds that

I(t)(v) ⊨ (L, Bound(r, v, I(t − Δt))).


Example 2.9. Let I1 be the interpretation from Example 2.8.
Suppose that (visPgB, [0.8, 1.0]) ∈ I1(1)(5). In this
case, I1 ⊨ R2. Let I2 be equivalent to I1 except that we have
(visPgB, [0.0, 0.5]) ∈ I2(1)(3). In this case, I2 ⊭ R2. ■

We  now  define  satisfaction  of  programs,  and  introduce 
canonical  interpretations,  in  which  time  points  that  are  not 
“targets”  retain  information  from  the  last  time  step. 

Definition 2.18. For interpretation I and program P:

I is a model for P iff it satisfies all rules, integrity constraints,
and fluent facts in that program, strictly satisfies
all non-fluent facts in the program, and for all L ∈ C, c ∈ Q
and t ∉ TTS(I, c, L, P), (L, [0, 1]) ∈ I(t)(c).

I is a canonical model for P iff it satisfies all rules, integrity
constraints, and fluent facts in P, strictly satisfies
all non-fluent facts in P, and for all L ∈ C, c ∈ Q, and
t ∉ TTS(I, c, L, P), (L, [0, 1]) ∈ I(t)(c) when t = 0 and
(L, bnd) ∈ I(t)(c) where (L, bnd) ∈ I(t − 1)(c), otherwise.

Example 2.10. Following from previous examples, if we
consider interpretation I1 and program P = {F7, R2}, we
have that (visPgB, [0.0, 0.2]) must be in I1(1)(2) in order for
I1 to be canonical. ■

2.3  Consistency  and  Entailment 

In this section we discuss consistency and entailment in
MANCaLog programs, and explore the use of minimal models
towards computing answers to these problems.

Definition 2.19 (Consistency). A MANCaLog program
P is (canonically) consistent iff there exists a (canonical)
model I of P.

Definition 2.20 (Entailment). A MANCaLog program
P (canonically) entails MANCaLog fact F iff for all (canonical)
models I of P, it holds that I ⊨ F.

Now we define an ordering over models and define the concept
of minimal model. We then show that if we can find a
minimal model then we can answer consistency, entailment,
and tight entailment queries. To do so, we first define a
pre-order over interpretations.

Definition  2.21  (Preorder  over  Interpretations). 
Given  interpretations  I,  I'  we  say  I  Cpre  /'  if  and  only  if  for 
all  t,v,L  if  there  exists  ( L,bnd )  £  I(t)(v)  then  there  must 
exist  ( L ,  bnd ')  £  I'(t)(v)  s.t.  bnd'  C  bnd. 
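The pre-order of Definition 2.21 can be illustrated with a minimal sketch. It assumes, purely for illustration, that every bound is a closed interval stored as a `(lo, hi)` pair and that an interpretation is a dictionary keyed by `(time, component)`; the names `Bound`, `Interp`, `subset`, and `precedes` are hypothetical and do not come from the paper.

```python
from typing import Dict, Tuple

Bound = Tuple[float, float]  # closed interval [lo, hi]; a simplifying assumption
Interp = Dict[Tuple[int, str], Dict[str, Bound]]  # (time, component) -> {label: bound}


def subset(inner: Bound, outer: Bound) -> bool:
    """Return True iff the interval `inner` is contained in `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def precedes(i1: Interp, i2: Interp) -> bool:
    """Check I1 ⊑pre I2: every atom in I1 has a counterpart in I2
    (same time, component, and label) whose bound is at least as tight."""
    for key, atoms in i1.items():
        for label, bnd in atoms.items():
            other = i2.get(key, {}).get(label)
            if other is None or not subset(other, bnd):
                return False
    return True


loose = {(1, "v1"): {"adopt": (0.0, 1.0)}}
tight = {(1, "v1"): {"adopt": (0.3, 0.6)}}
```

Under these assumptions, `precedes(loose, tight)` holds while `precedes(tight, loose)` does not, matching the intuition that interpretations higher in the ordering carry tighter bounds.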

Next, we define an equivalence relation for interpretations
denoted with $\sim$; we will use the notation $[I]$ for the set of
all interpretations equivalent to $I$ w.r.t. $\sim$. This allows us
to define a partial ordering.

Definition 2.22. Two interpretations $I$, $I'$ are equivalent
(written $I \sim I'$) iff for all $P \in \mathcal{P}$, $I \models P$ iff $I' \models P$.

Definition 2.23 (Partial Ordering). Given classes
of interpretations $[I]$, $[I']$ that are equivalent w.r.t. $\sim$, we say
that $[I]$ precedes $[I']$, written $[I] \sqsubseteq [I']$, iff $I \sqsubseteq^{pre} I'$.

The partial ordering is clearly reflexive, antisymmetric,
and transitive. Note that we will often use $I \sqsubseteq I'$ as shorthand
for $[I] \sqsubseteq [I']$. We define two special interpretations, $\bot$
and $\top$, such that for all $t \in \tau$, $c \in Q$, $\bot(t)(c) = \emptyset$ and there exists
network atom $\langle L, \emptyset \rangle \in \top(t)(c)$. Clearly, no other interpretation
can be below $\bot$, as the $[\ell, u]$ bound on all network atoms
for each time step and each component is $[0,1]$; similarly, no
other interpretation is above $\top$, since for any interpretation
$I$ for which there exists $\langle L, bnd \rangle \in I(t)(c)$ where $bnd \neq \emptyset$,
we have $\emptyset \subseteq bnd$. We can prove (see the full version of the
paper for details) that with $\top$ and $\bot$, $(\mathcal{I}, \sqsubseteq)$ is a complete
lattice. We can now arrive at the notion of minimal model
for a MANCaLog program.

Definition 2.24 (Minimal Model). Given program $P$,
the minimal model of $P$ is a (canonical) interpretation $I$ s.t.
$I \models P$ and for all (canonical) interpretations $I'$ s.t. $I' \models P$,
we have that $I \sqsubseteq I'$.

Suppose we have some algorithm $A$ that takes as input
a program $P$ and returns an interpretation $I$ (where $I$ does
not necessarily satisfy $P$) s.t. for all $I'$ where $I' \models P$, $I \sqsubseteq I'$.

It is easy to show that if $A(P) \models P$ then $P$ is consistent.
Likewise, if $A(P) = \top$ then $P$ is inconsistent, as all models
must then have a tighter weight bound for the network
atoms than an invalid interpretation (hence, making such an
interpretation invalid as well). Clearly, any such algorithm
$A$ would provide a sound and complete answer to the consistency
problem. Likewise, if we consider the entailment problem,
it is easy to show that for fact $F = (\langle L, bnd \rangle, c) : [t_1, t_2]$,
$P$ (canonically) entails $F$ iff the minimal model of $P$ (canonically)
satisfies $F$. This is because for minimal model $A(P)$
of $P$, for any time $t \in [t_1, t_2]$, if $A(P)(t)(c) \models \langle L, bnd \rangle$ then
there is network atom $\langle L, bnd' \rangle \in A(P)(t)(c)$ s.t. $bnd' \subseteq bnd$.
We note that for any other model $I$ of $P$ with
$\langle L, bnd'' \rangle \in I(t)(c)$ we have that $bnd' \supseteq bnd''$. Hence, having
a minimal model allows us to solve any entailment query.
We can think of a minimal model of a MANCaLog program
as the outcome of a diffusion process in a multi-agent system
modeled as a complex network. Hence, a question such as
“how many agents will adopt the product with a weight of
at least 0.9 in two months?” can be easily answered once
the minimal model is obtained.
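To make the last point concrete, the following sketch answers an adoption query against an already computed minimal model. It assumes closed-interval bounds, a dictionary-based interpretation, and that a missing atom stands for the uninformative bound [0, 1]; `satisfies_fact`, `count_adopters`, and the toy data are hypothetical names and values chosen for illustration only.

```python
from typing import Dict, Iterable, Tuple

Bound = Tuple[float, float]  # closed interval [lo, hi]; a simplifying assumption
Interp = Dict[Tuple[int, str], Dict[str, Bound]]  # (time, component) -> {label: bound}
UNKNOWN: Bound = (0.0, 1.0)  # assumed bound for atoms absent from the model


def satisfies_fact(model: Interp, label: str, bnd: Bound,
                   component: str, t1: int, t2: int) -> bool:
    """Check that the model's bound for `label` on `component` lies inside
    `bnd` at every time point in [t1, t2]."""
    for t in range(t1, t2 + 1):
        lo, hi = model.get((t, component), {}).get(label, UNKNOWN)
        if not (bnd[0] <= lo and hi <= bnd[1]):
            return False
    return True


def count_adopters(model: Interp, agents: Iterable[str], label: str,
                   min_weight: float, t: int) -> int:
    """Count agents whose weight for `label` is entailed to be >= min_weight at time t."""
    return sum(
        satisfies_fact(model, label, (min_weight, 1.0), agent, t, t)
        for agent in agents
    )


# Toy minimal model at time step 2 (illustrative values only).
toy_model: Interp = {
    (2, "ann"): {"adopt": (0.95, 1.0)},
    (2, "bob"): {"adopt": (0.40, 0.90)},
    (2, "cy"): {"adopt": (0.90, 0.97)},
}
adopters = count_adopters(toy_model, ["ann", "bob", "cy", "dee"], "adopt", 0.9, 2)
```

Because the query only inspects the minimal model, its cost is linear in the number of agents once that model is available.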

3.  FIXED  POINT  MODEL  COMPUTATION 

In this section we introduce a fixed-point operator that
produces the non-canonical minimal model of a MANCaLog
program in polynomial time. This is followed by an algorithm
to find a canonical minimal model also in polynomial
time. First, we introduce three preliminary definitions.

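Definition 3.1. For a given MANCaLog program $P$, $c \in Q$,
$L \in \mathcal{L}$, and $t \in \tau$ we define function

$$FBnd(P, c, t, L) = \bigcap_{(\langle L, bnd \rangle, c):[t_1, t_2] \in P \ \text{s.t.}\ t \in [t_1, t_2]} bnd$$

Definition 3.2. For a given MANCaLog program $P$, interpretation $I$,
$c \in Q$, $L \in \mathcal{L}$, and $t \in \tau$ we define function

$$IBnd(P, I, c, t, L) = \bigcap_{\langle L, bnd \rangle \hookleftarrow a \in P \ \text{s.t.}\ I(t)(c) \models a} bnd$$

Definition 3.3. Given MANCaLog program $P$, interpretation
$I$, $v \in V$, $L \in \mathcal{L}$, and $t \in \tau$, we define

$$RBnd(P, I, v, t, L) = \bigcap_{r \in P \ \text{s.t.}\ t \in TTS(I, v, L, P) \cap TTS(I, v, r)} Bound(r, v, I(t - \Delta t))$$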
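As an illustration of how these intersections behave, here is a minimal sketch of the fact-based bound of Definition 3.1. It assumes closed-interval bounds, represents a fact as a tuple `(label, bound, component, t1, t2)`, and treats the intersection over an empty family of facts as the full range [0, 1]; the names `intersect` and `fbnd` and the tuple layout are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Bound = Tuple[float, float]  # closed interval [lo, hi]; a simplifying assumption
Fact = Tuple[str, Bound, str, int, int]  # (label, bound, component, t1, t2)


def intersect(a: Optional[Bound], b: Bound) -> Optional[Bound]:
    """Intersect two closed intervals; None stands for the empty bound."""
    if a is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def fbnd(facts: Iterable[Fact], c: str, t: int, label: str) -> Optional[Bound]:
    """Intersect the bounds of all facts about `label` on component `c`
    whose time interval contains `t`."""
    bnd: Optional[Bound] = (0.0, 1.0)  # assumed result when no fact applies
    for f_label, f_bnd, f_c, t1, t2 in facts:
        if f_label == label and f_c == c and t1 <= t <= t2:
            bnd = intersect(bnd, f_bnd)
    return bnd


toy_facts = [
    ("adopt", (0.2, 0.9), "v1", 0, 3),
    ("adopt", (0.5, 1.0), "v1", 2, 5),
    ("adopt", (0.0, 0.1), "v2", 0, 5),
]
```

An empty result (`None` in this sketch) plays the role of the empty bound that signals inconsistency; the integrity-constraint and rule-based bounds can be accumulated with the same `intersect` helper.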


We  can  now  introduce  the  operator. 

Definition 3.4 ($\Gamma$ Operator). For a given MANCaLog
program $P$, we define the operator $\Gamma_P : \mathcal{I} \to \mathcal{I}$ as follows:
For a given $I$, for each $t \in \tau$, $c \in Q$, and $L \in \mathcal{L}$, add $\langle L, bnd \rangle$
to $\Gamma_P(I)(t)(c)$ where $bnd$ is defined as:

$$bnd = bnd_{prv} \cap FBnd(P, c, t, L) \cap IBnd(P, I, c, t, L) \cap RBnd(P, I, c, t, L)$$

where $\langle L, bnd_{prv} \rangle \in I(t)(c)$.

It is easy to show that $\Gamma$ can be computed in polynomial
time (the proof is in the full version). Next, we introduce
notation for repeated applications of $\Gamma$.


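Definition 3.5 (Iterated Applications of $\Gamma$). Given
natural number $i > 0$, interpretation $I$, and program $P$, we
define $\Gamma_P^i(I)$, the multiple applications of $\Gamma$, as follows:

$$\Gamma_P^i(I) = \begin{cases} \Gamma_P(I) & \text{if } i = 1 \\ \Gamma_P(\Gamma_P^{i-1}(I)) & \text{otherwise} \end{cases}$$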
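Iterating an operator until it stops changing its input can be sketched as follows. The driver `iterate_to_fixpoint` is generic; the operator built by `make_toy_step` is only a stand-in for $\Gamma_P$ (it raises a node's lower bound to that of its in-neighbors) and is not the operator of Definition 3.4. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple, TypeVar

T = TypeVar("T")
Bound = Tuple[float, float]  # closed interval [lo, hi]; a simplifying assumption


def iterate_to_fixpoint(step: Callable[[T], T], start: T,
                        max_iters: int = 10_000) -> Tuple[T, int]:
    """Apply `step` repeatedly until the result stops changing.
    Returns the fixed point and the number of applications that changed it."""
    cur = start
    for k in range(1, max_iters + 1):
        nxt = step(cur)
        if nxt == cur:
            return cur, k - 1
        cur = nxt
    raise RuntimeError("no convergence within max_iters applications")


def make_toy_step(edges: List[Tuple[str, str]]):
    """Build a toy tightening operator: each node's lower bound is raised to
    the lower bound its in-neighbors had in the previous interpretation."""
    def step(interp: Dict[str, Bound]) -> Dict[str, Bound]:
        new = dict(interp)
        for u, v in edges:
            lo_v, hi_v = new[v]
            new[v] = (max(lo_v, interp[u][0]), hi_v)
        return new
    return step


start = {"a": (0.8, 1.0), "b": (0.0, 1.0), "c": (0.0, 1.0)}
fixed, applications = iterate_to_fixpoint(make_toy_step([("a", "b"), ("b", "c")]), start)
```

Since every application can only tighten bounds, the loop terminates once no bound changes, which mirrors the convergence argument used for the iterated operator.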


We can prove that the iterated $\Gamma$ operator converges after a
polynomial number of applications:

Theorem 3.1. Given interpretation $I$ and program $P$,
there exists a natural number $k$ s.t. $\Gamma_P^k(I) = \Gamma_P^{k+1}(I)$, and

$$k \in O(|P| \cdot d^{in}_* \cdot t_{max} \cdot |E|)$$

where $d^{in}_*$ is the maximum in-degree in the network.

Proof (sketch). For a given vertex $i \in V$, we will use
the notation $d^{in}_i$ to denote the number of incoming neighbors
(of any edge type). First note that for a given $t \in \tau$, $i \in V$,
and $L \in \mathcal{L}$, a given rule $r$ can tighten the bound on a network
atom formed with $L$ no more than $(d^{in}_i + 1) \cdot (d^{in}_i + 1)$ times.
At each application of $\Gamma$, at least one network atom must
tighten. Hence, as there are only $O(|P| \cdot d^{in}_* \cdot t_{max} \cdot |E|)$
tightenings possible, this is also the bound on the number
of applications of $\Gamma$. □


In the following, we will use the notation $\Gamma_P^*$ to denote the
iterated application of $\Gamma$ after a number of steps sufficient
for convergence; Theorem 3.1 means that we can efficiently
compute $\Gamma_P^*$. We also note that as a single application of $\Gamma$
can be computed in polynomial time, this implies that we
can find a minimal model of a MANCaLog program in polynomial
time. We now prove the correctness of the operator.
We do this first by proving a key lemma that, when combined
with a claim showing that for consistent program $P$,
$\Gamma_P^*$ is a model of $P$, tells us that $\Gamma_P^*$ is a minimal model for $P$.
Following directly from this, we have that $P$ is inconsistent
iff $\Gamma_P^* = \top$.

Lemma 3.2. If $I \models P$ and $I' \sqsubseteq I$ then $\Gamma_P(I') \sqsubseteq I$.

Theorem 3.3. If program $P$ is consistent then $\Gamma_P^*$ is a
minimal model for $P$.

These results, when taken together, prove that tight entailment
and consistency problems for MANCaLog can be solved
in polynomial time, which is precisely what we set out to accomplish
as part of our desiderata described in Section 1.1.
Next, we develop an algorithm for the canonical versions
of consistency and tight entailment, and show that we can
bound the running time of the algorithm with a polynomial.

Algorithm 1 CANON_PROC

Require: Program $P$
Ensure: Interpretation cur_interp

1: cur_interp = $\Gamma_P^*(\bot)$;
2: Initialize matrix array cur_free[·][·] where for $v \in V$ and
   $L \in \mathcal{L}$, cur_free[v][L] = $\tau - TTS(\text{cur\_interp}, v, L, P) - \{0\}$;
3: Initialize array vl_pr[·] where for each $t \in [1, t_{max}]$,
   vl_pr[t] = $\{(v, L) \mid t \in$ cur_free[v][L]$\}$;
4: for $t = 1, \ldots, t_{max}$ do
5:   if vl_pr[t] $\neq \emptyset$ then
6:     for $(v, L) \in$ vl_pr[t] do
7:       Remove $\langle L, bnd \rangle$ from cur_interp$(t)(v)$;
8:       Let $a$ be the atom in cur_interp$(t-1)(v)$ of the form $\langle L, bnd' \rangle$;
9:       Add $a$ to cur_interp$(t)(v)$;
10:    end for
11:    Set cur_interp = $\Gamma_P^*(\text{cur\_interp})$;
12:  end if
13:  For $v \in V$ and $L \in \mathcal{L}$, cur_free[v][L] =
     $\tau - TTS(\text{cur\_interp}, v, L, P) - \{0, \ldots, t\}$
14:  For each $t' \in [t+1, t_{max}]$, vl_pr[t'] = $\{(v, L) \mid t' \in$ cur_free[v][L]$\}$
15: end for
16: return cur_interp

We also note that subsequent runs of the convergence of $\Gamma$
will likely complete quicker in practice, as the initial interpretation
is the last interpretation calculated (cf. line 11).
We also show that the interpretation produced by the algorithm
is a canonical minimal model. Following from that, a
program is inconsistent iff the algorithm returns $\top$.

Proposition 3.1. Algorithm CANON_PROC performs no
more than $1 + t_{max} \cdot \min(|\mathcal{L}|, |P|) \cdot |V|$ calculations of the
convergence of $\Gamma$.

Theorem  3.4.  If  P  is  consistent,  then  CANON_PROC(P) 
is  the  minimal  canonical  model  of  P. 
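The control flow of CANON_PROC can be sketched as follows, under strong simplifying assumptions: the converged operator and the target time sets are passed in as callables (`gamma_star` and `target_times`, both hypothetical), bounds are closed intervals, and the bookkeeping arrays of lines 2-3 and 13-14 are replaced by recomputing the free (vertex, label) pairs at each time step. It illustrates the persistence idea rather than the full algorithm.

```python
import copy
from typing import Callable, Dict, Iterable, List, Set, Tuple

Bound = Tuple[float, float]  # closed interval [lo, hi]; a simplifying assumption
Interp = Dict[Tuple[int, str], Dict[str, Bound]]  # (time, vertex) -> {label: bound}


def canon_proc(bottom: Interp,
               gamma_star: Callable[[Interp], Interp],
               target_times: Callable[[Interp, str, str], Set[int]],
               vertices: Iterable[str], labels: Iterable[str],
               t_max: int) -> Interp:
    """Carry bounds forward at non-target time points, re-converging after each step."""
    vertices, labels = list(vertices), list(labels)
    cur = gamma_star(bottom)                          # line 1
    for t in range(1, t_max + 1):                     # line 4
        free: List[Tuple[str, str]] = [
            (v, lab) for v in vertices for lab in labels
            if t not in target_times(cur, v, lab)
        ]
        if free:                                      # line 5
            for v, lab in free:                       # lines 6-10
                cur[(t, v)][lab] = cur[(t - 1, v)][lab]
            cur = gamma_star(cur)                     # line 11
    return cur


# Toy stand-ins: a single fact fixes the bound at time 1, which is the only target time.
def toy_gamma_star(interp: Interp) -> Interp:
    out = copy.deepcopy(interp)
    lo, hi = out[(1, "v")]["adopt"]
    out[(1, "v")]["adopt"] = (max(lo, 0.7), hi)
    return out


def toy_targets(interp: Interp, v: str, lab: str) -> Set[int]:
    return {1}


bottom: Interp = {(t, "v"): {"adopt": (0.0, 1.0)} for t in range(4)}
model = canon_proc(bottom, toy_gamma_star, toy_targets, ["v"], ["adopt"], 3)
```

In this toy run the bound established at time 1 persists to times 2 and 3, which is the behavior that distinguishes a canonical model from a non-canonical one.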

4.  APPLICATIONS 

In  this  section,  we  will  briefly  discuss  work  in  progress  on 
how  MANCaLog  can  be  applied  in  real  world  settings. 

It is widely acknowledged that modeling influence in multi-agent
systems (most usefully modeled as complex networks)
is  highly  desirable  for  many  practical  problems  as  varied 
as  viral  marketing,  prevention  of  drug  use,  vaccination,  and 
power  plant  failure.  Though  MANCaLog  programs  are  a  rich 
model  to  work  with,  the  acquisition  of  rules  is  the  principal 
hurdle  to  overcome;  this  is  mainly  due  to  this  richness  of 
representation,  since  for  each  rule  we  must  provide  a  set 
of  conditions  on  the  agents  being  influenced,  conditions  on 
their  neighbors  and  their  ties  to  their  neighbors,  and  how 
capable  these  neighbors  are  of  influencing  them.  A  domain 
expert  is  likely  able  to  provide  important  insights  into  these 
components, but the best way to obtain these rules is undoubtedly
to leverage the presence of large amounts of data
in  domains  like  Twitter  (with  about  340M  messages  sent  per 
day,  available  through  public  APIs),  Facebook  (over  950M 
users  with  more  complex  information;  not  publicly  available, 
but  data  can  be  requested  through  apps),  and  blogging  and 
photo  hosting  sites  such  as  Blogger  and  Flickr  (which  have 
millions  of  users  as  well). 
Figure 2: An architecture for obtaining MANCaLog
programs from available data sources.


Concretely,  we  have  begun  working  towards  this  goal  by 
extracting  several  time-series,  multi-attribute  network  data 
sets  on  which  to  apply  MANCaLog.  For  instance,  to  study 
the  proliferation  of  research  on  different  topics,  we  looked 
at  research  on  “niacin”  indexed  by  Thomson  Reuters  Web 
of Knowledge (http://wokinfo.com). This topic was chosen
due to its interest to a variety of disciplines, such as
medicine, biology, and chemistry; this gives the data more
variety compared with more discipline-specific topics. We
extracted an author-paper bipartite network consisting of
3,790 papers with 10,465 authors and 16,722 edges (cf. Figure 3);
from this data we can easily focus on various kinds of
networks  (co-author,  citation,  etc.).  We  have  also  collected 
attribute  and  time-series  data  for  this  network,  as  well  as 
the  subjects  of  the  papers;  the  propagation  of  these  subjects 
is  a  good  starting  point  to  test  methods  for  the  acquisition 
of  MANCaLog  rules.  We  are  harvesting  larger  datasets  from 
various  online  social  networks.  Further  details  can  be  found 
in  the  full  version  of  the  paper. 
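As a small illustration of how one such network can be derived, the sketch below projects author-paper edges onto a weighted co-author network. The function name `coauthor_edges` and the toy edge list are hypothetical and are not taken from the extracted dataset.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple


def coauthor_edges(author_paper: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], int]:
    """Project (author, paper) edges onto co-author pairs, weighted by
    the number of papers the two authors share."""
    by_paper = defaultdict(set)
    for author, paper in author_paper:
        by_paper[paper].add(author)
    weights: Dict[Tuple[str, str], int] = defaultdict(int)
    for authors in by_paper.values():
        # Sort so that each unordered pair maps to a single key.
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Toy bipartite edges (illustrative only).
toy_edges = [
    ("ann", "p1"), ("bob", "p1"),
    ("ann", "p2"), ("bob", "p2"), ("cy", "p2"),
]
projection = coauthor_edges(toy_edges)
```

A citation network could be obtained analogously from paper-to-paper edges; time stamps on papers would then supply the time-series dimension.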

A  proposed  learning  architecture.  We  are  currently 
developing  a  MANCaLog  learning  architecture  (depicted  in 
Fig.  2)  based  on  the  use  of  state-of-the-art  data  analysis, 
clustering,  and  influence  learning  techniques  as  building  blocks 
for  the  acquisition  of  MANCaLog  rules  from  data  sets.  The 
key question is not just the identification of the best techniques
to adopt, but how to adapt them and combine them
in  such  a  way  as  to  produce  meaningful  and  useful  outputs. 

Consider  the  diagram  in  Fig.  2:  the  data  first  flows  from 
raw  data  sources  to  the  cluster  identification  component, 
which  has  the  goal  of  identifying  sets  of  agents  behaving 
as  groups  (for  instance,  teens  influencing  other  teens  of  the 
same sex in the consumption of music, or scientists of a certain
field influencing the research topics of others in a related
field)  [16,  9];  the  main  output  here  is  a  set  of  conditions  on 
nodes  and  edges  that  characterize  groups  of  nodes.  Once 
clusters  are  identified,  the  influence  recognition  component 
will  make  use  of  both  the  clusters  and  the  data  sources  to 
recognize what kind of influence is present in the system [1,
5, 6]; the main output of this component is the influence
function to be used in the MANCaLog rules. The rule generation
component then takes the output of the cluster identification
and influence recognition components, along with
the raw data (e.g., to analyze time stamps) and produces
MANCaLog  rules;  the  output  of  this  component  is  involved 
in  a  refinement  cycle  with  experts  who  can  provide  feedback 
on  the  rules  being  produced  (such  as  possible  combinations 
of  rules,  identification  of  cases  of  overfitting,  etc.). 


Figure  3:  (Left)  Visualization  of  a  multi-attribute 
time-series  author-paper  network  from  1952  to  2012. 
(Top-Right)  Close-up  of  the  data  inside  the  small  box 
in the main figure. (Bottom-Right) Close-up showing
node  attributes.  In  all  cases,  authors  are  colored 
green  and  papers  are  colored  red.  Data  extracted 
from  Thomson  Reuters  Web  of  Knowledge. 


5.  CONCLUSION 

In  this  paper,  we  presented  the  MANCaLog  language  for 
modeling  cascades  in  multi-agent  systems  organized  in  the 
form  of  complex  networks.  We  started  by  establishing  seven 
criteria  in  the  form  of  desiderata  for  such  a  formalism,  and 
proved  that  MANCaLog  meets  all  of  them;  to  the  best  of  our 
knowledge,  this  has  not  been  accomplished  by  any  previous 
model  in  the  literature.  We  also  note  that  MANCaLog  is  the 
first  language  of  its  kind  to  consider  network  structure  in 
the  semantics,  potentially  opening  the  door  for  algorithms 
that  leverage  features  of  network  topology  in  more  efficiently 
answering  queries.  Our  current  work  involves  implementing 
the algorithms described in this paper, as well as the real-world
applications described in Section 4; though our algorithms
have polynomial time complexity, it is likely that
further  optimizations  will  be  needed  in  practice  to  ensure 
scalability  for  very  large  data  sets. 

In  the  near  future,  we  shall  also  explore  various  types 
of  queries  that  have  been  studied  in  the  literature,  such 
as  finding  agents  of  maximum  influence,  identifying  agents 
that  cause  a  cascade  to  spread  more  quickly,  and  identifying 
agents  that  can  be  influenced  in  order  to  halt  a  cascade. 
8. APPENDIX


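8.1 Set of interpretations form a complete lattice

With top interpretation $\top$ and bottom interpretation $\bot$,
$(\mathcal{I}, \sqsubseteq)$ is a complete lattice.

Proof. Let $\mathcal{I}'$ be a subset of $\mathcal{I}$. We can create $inf(\mathcal{I}')$
as follows. We build interpretation $I'$. For each $t \in \tau$, $c \in Q$,
$L \in \mathcal{L}$, let $\ell_1$ be the least of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L, [\ell, u] \rangle \in I(t)(c) \text{ or } \langle L, [\ell, u) \rangle \in I(t)(c)\}$
and $\ell_2$ be the least of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L, (\ell, u] \rangle \in I(t)(c) \text{ or } \langle L, (\ell, u) \rangle \in I(t)(c)\}$.
Then, for each $t \in \tau$, $c \in Q$, $L \in \mathcal{L}$, let $u_1$ be the greatest element of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L, [\ell, u] \rangle \in I(t)(c) \text{ or } \langle L, (\ell, u] \rangle \in I(t)(c)\}$
and $u_2$ be the greatest of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L, [\ell, u) \rangle \in I(t)(c) \text{ or } \langle L, (\ell, u) \rangle \in I(t)(c)\}$.
If there is any interpretation $I$ in $\mathcal{I}'$ where there is not some $bnd$ s.t.
$\langle L, bnd \rangle \in I(t)(c)$ then add $\langle L, [0,1] \rangle$ to $I'(t)(c)$. If $\ell_2 < \ell_1$
and $u_1 \geq u_2$ then add $\langle L, (\ell_2, u_1] \rangle$ to $I'(t)(c)$. If $\ell_2 < \ell_1$ and
$u_2 > u_1$ then add $\langle L, (\ell_2, u_2) \rangle$ to $I'(t)(c)$. If $\ell_2 \geq \ell_1$ and
$u_2 > u_1$ then add $\langle L, [\ell_1, u_2) \rangle$ to $I'(t)(c)$. Finally, if $\ell_2 \geq \ell_1$
and $u_1 \geq u_2$ then add $\langle L, [\ell_1, u_1] \rangle$ to $I'(t)(c)$. Clearly,
$I' = inf(\mathcal{I}')$.

In the next part of the proof, we show we can create
$sup(\mathcal{I}')$ as follows. We build interpretation $I'$. For each
$t \in \tau$, $c \in Q$, $L \in \mathcal{L}$, let $\ell_1$ be the greatest of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L, [\ell, u] \rangle \in I(t)(c) \text{ or } \langle L, [\ell, u) \rangle \in I(t)(c)\}$
and $\ell_2$ be the greatest of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L, (\ell, u] \rangle \in I(t)(c) \text{ or } \langle L, (\ell, u) \rangle \in I(t)(c)\}$.
Then, for each $t \in \tau$, $c \in Q$, $L \in \mathcal{L}$, let $u_1$ be the least element of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L, [\ell, u] \rangle \in I(t)(c) \text{ or } \langle L, (\ell, u] \rangle \in I(t)(c)\}$
and $u_2$ be the least of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L, [\ell, u) \rangle \in I(t)(c) \text{ or } \langle L, (\ell, u) \rangle \in I(t)(c)\}$.
If $\max(\ell_1, \ell_2) > \min(u_1, u_2)$ or $(\ell_2 \geq \ell_1) \wedge (u_2 \leq u_1) \wedge (\ell_2 = u_2)$
then add $\langle L, \emptyset \rangle$ to $I'(t)(c)$. If $\ell_2 \geq \ell_1$ and
$u_1 < u_2$ then add $\langle L, (\ell_2, u_1] \rangle$ to $I'(t)(c)$. If $\ell_2 \geq \ell_1$ and
$u_2 \leq u_1$ then add $\langle L, (\ell_2, u_2) \rangle$ to $I'(t)(c)$. If $\ell_2 < \ell_1$ and
$u_2 \leq u_1$ then add $\langle L, [\ell_1, u_2) \rangle$ to $I'(t)(c)$. Finally, if $\ell_2 < \ell_1$
and $u_1 < u_2$ then add $\langle L, [\ell_1, u_1] \rangle$ to $I'(t)(c)$. Clearly,
$I' = sup(\mathcal{I}')$.

As both $inf(\mathcal{I}')$ and $sup(\mathcal{I}')$ exist and are clearly in $\mathcal{I}$, the
statement follows. □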


8.2 A single application of $\Gamma$ can be computed
in polynomial time

For interpretation $I$, $\Gamma(I)$ can be computed by conducting
$O(|P| \cdot |V| \cdot t_{max} \cdot d^{in}_*)$ satisfaction checks where $d^{in}_*$ is the
maximum in-degree of a node in the network. (This combined
with the assumption that the influence function is computed
in constant time results in polynomial time computation for
a single application of $\Gamma$.)

Proof. We note that a given rule will require the most
satisfaction checks, as a rule will potentially affect a network
atom of a certain label for each vertex-time point pair. By
the definition of RBnd, a given rule clearly causes no more
than $O(d^{in}_*)$ satisfaction checks for each such pair. As the
number of rules is no more than $|P|$, the statement follows. □


8.3  Proof  of  Theorem  3.1 

Given interpretation $I$ and program $P$, there exists a natural
number $k$ s.t. $\Gamma_P^k(I) = \Gamma_P^{k+1}(I)$, and

$$k \in O(|P| \cdot d^{in}_* \cdot t_{max} \cdot |E|)$$

where $d^{in}_*$ is the maximum in-degree in the network.

Proof. For a given vertex $i \in V$, we will use the notation
$d^{in}_i$ to denote the number of incoming neighbors (of any
edge type) and $d^{in}_* = \max_i d^{in}_i$. First we show that for a
given $t \in \tau$, $i \in V$, and $L \in \mathcal{L}$, a given rule $r$ can tighten
the bound on a network atom formed with $L$ no more than
$(d^{in}_i + 1) \cdot (d^{in}_i + 1)$ times. This is because a given rule
adjusts the bound on a network atom based on the number
of eligible and qualifying neighbors, which are each bounded by
$d^{in}_i$. At each application of $\Gamma$, at least one
network atom must tighten. Hence, as there are only
$O(|P| \cdot d^{in}_* \cdot t_{max} \cdot \sum_i d^{in}_i) = O(|P| \cdot d^{in}_* \cdot t_{max} \cdot |E|)$ tightenings
possible, this is also the bound on the number of applications
of $\Gamma$. □

8.4  Proof  of  Lemma  3.2 

If $I \models P$ and $I' \sqsubseteq I$ then $\Gamma_P(I') \sqsubseteq I$.

Proof. Suppose, BWOC, that $\Gamma_P(I') \not\sqsubseteq I$. Then, there
exists some $L \in \mathcal{L}$, $t \in \tau$ and $c \in Q$ s.t. $\langle L, bnd \rangle \in I(t)(c)$,
$\langle L, bnd' \rangle \in I'(t)(c)$, and $\langle L, bnd'' \rangle \in \Gamma_P(I')(t)(c)$ s.t. $bnd \supset bnd''$
and $bnd' \supset bnd''$. There are four things that affect
$bnd''$: facts, rules, integrity constraints and $bnd'$. Clearly,
we need not consider the effect that either facts or $bnd'$
have on $bnd''$, as $I$ satisfies all facts and $I' \sqsubseteq I$. We also
note that a given integrity constraint imposed by Definition 3.2
can tighten $bnd''$ no more than the associated bound
in any model. Hence, there must be some rule
$r = L \xleftarrow{\Delta t} f, (g_{edge}, g_{node}, h)_{ifl}$ that causes $bnd''$ to become less than
$bnd$. As $bnd'' \neq bnd'$, we know that $t \in TTS(\Gamma_P(I'), c, r) \cap$
TTS(I' ,  c,  r).  As  a  result,  we  have  —  Af)(c)  |=  /  and 

I'(t  —  At)(c)  |=  /.  Further,  as  7  |=  P,  I'  C  7,  and  no  rule 
can  modify  a  non-fluent  atom,  we  have 

Elig  (v ,  Qedge ,  Qnode  ,  F(7  )  (t  At)|  = 

\Elig(v,  gedge,  g-node,E{t  -  At)  \  = 

|  EUg  ( V ,  gedge ,  9node  ,  E  (7  )(t  At)  | . 

Further,  we  know  that  as  7'  C  7,  it  must  be  the  case  that 
|  Qual(v,  gedge ,  9node,  A,  7(f  At))|  P 

|  Qual  (V,  gedge,  gnode  ,h,l'(t~  At)  )  | . 

This implies, by Axiom 2, that $Bound(r, v, I(t - \Delta t)) \subseteq$
$Bound(r, v, I'(t - \Delta t))$. This then implies that $bnd \subseteq bnd''$,
which is a contradiction. □

8.5  Proof  of  Theorem  3.3 

If program $P$ is consistent then $\Gamma_P^*$ is a minimal model for $P$.

Proof. Claim: If program $P$ is consistent then $\Gamma_P^*$ is a
model of $P$.

Suppose, BWOC, that there is a fact in $P$ that $\Gamma_P^*$ does not
satisfy. However, by the definition of $\Gamma$ and the definition of
a fact, $\Gamma_P^*$ must satisfy all facts as the bound on the weight
associated with each fact is included in the intersection. Further,
we can also see by the definition of $\Gamma$ that $\Gamma_P^*$ strictly
satisfies all non-fluent facts in $P$. We also note that the final
application of the $\Gamma$ operator ensures that all integrity
constraints are satisfied by $\Gamma_P^*$. Now suppose, BWOC, that
there is a rule in $P$ that $\Gamma_P^*$ does not satisfy. However, with
each application of $\Gamma$, for each rule, we include the bound
on the weight returned by the Bound function for each time
step in the target time set associated with that rule. As $\Gamma$
is applied to convergence, and new bounds are intersected
with each application, then we know that all time points in
any associated target time set are considered in the intersection.

Proof of Theorem: The above claim tells us that $\Gamma_P^* \models P$.
Now consider interpretation $I$ s.t. $I \models P$. As $\bot \sqsubseteq I$, multiple
applications of Lemma 3.2 tell us that $\Gamma_P^* \sqsubseteq I$. Hence,
the statement follows. □

8.6  Proof  of  Theorem  3.4 

If  P  is  consistent,  then  CANON_PROC(P)  is  the  minimal 
canonical  model  of  P. 

Proof. CLAIM 1: If $P$ is consistent, then CANON_PROC($P$)
is a canonical model of $P$.

Clearly, $I$ = CANON_PROC($P$) satisfies all facts and integrity
constraints in $P$. Hence, we shall consider programs
that only consist of rules in this proof. We say $I$ $L$-canonically
satisfies $P$ iff $I$ canonically satisfies $\{r \in P \mid head(r) = L\}$.
Clearly, $I$ canonically satisfies $P$ if for all $L \in \mathcal{L}$, $P$ is
$L$-canonically satisfied by $I$. We say that $I$ is an $(L, c, q)$-canonically
consistent interpretation (an $(L, c, q)$-model, for short) if for $c \in Q$, for the
first $q$ time points $t \in \tau - TTS(I, c, L, P) - \{0\}$, $I(t)(c) \models \langle L, bnd \rangle$ where
$\langle L, bnd \rangle \in I(t-1)(c)$. Consider some $L \in \mathcal{L}$ and $c \in Q$.
Clearly, $I$ is an $(L, c, 0)$-model for $P$. Let us assume, for
some value $q$, that $I$ is an $(L, c, q-1)$-model for $P$. Let time
point $t$ be the $q$-th element of $\tau - TTS(I, c, L, P) - \{0\}$.
Consider the time step before time $t$ is considered in the for-loop
at line 4 of CANON_PROC, which causes the condition
at line 5 to be true. By line 13, $\tau - TTS(I, c, L, P) - \{0\} \subseteq$
cur_free[c][L]. This means that $t$ is a member of both.
Hence, when $t$ is considered at line 4, the condition at line 5
is true, causing $\langle L, bnd \rangle \in I(t)(c) \cap I(t-1)(c)$, and as the
element $\langle L, bnd \rangle \in I(t-1)(c)$ is not changed here, we have
shown that $I$ is an $(L, c, q)$-model for $P$. By the for-loop at
line 6, for all $L \in \mathcal{L}$ and $c \in Q$, $I$ is an $(L, c, q)$-model for $P$.
Hence, at the for-loop at line 4, we can be assured that for
$L \in \mathcal{L}$ and $c \in Q$, $I$ is an $(L, c, |\tau - TTS(I, c, L, P) - \{0\}|)$-model
for $P$, which means that $I$ canonically satisfies $P$.

CLAIM 2: If $I$ is a canonical model for $P$,
cur_interp $\sqsubseteq I$ is an interpretation that also strictly satisfies
all non-fluent facts in $P$, and cur_interp' is cur_interp after
being manipulated in lines 6-10 of CANON_PROC, then
cur_interp' $\sqsubseteq I$.

We note that by the definition of satisfaction of a non-fluent
fact, and the fact that both cur_interp and $I$ must
strictly satisfy all non-fluent facts in $P$, we know that for all
$c \in Q$ and $L \in \mathcal{L}$ that:

$$TTS(I, c, L, P) = TTS(\text{cur\_interp}, c, L, P) = TTS(\text{cur\_interp}', c, L, P)$$

Let us assume that lines 6-10 of the algorithm are changing
cur_interp when the outer loop is considering time $t$ and
that the condition at line 5 is true. Clearly,

$$t \in \tau - TTS(I, v', L', P) - \{0\}$$

As a result, for any $(v, L)$ pair considered at this point
by the algorithm, if $\langle L, bnd \rangle \in I(t)(v)$ and $\langle L, bnd' \rangle \in$
$I(t-1)(v)$ then we have $bnd = bnd'$. By the algorithm,
if we have $\langle L, bnd^* \rangle \in$ cur_interp'$(t)(v)$ and $\langle L, bnd^{**} \rangle \in$
cur_interp'$(t-1)(v)$ we have that $bnd^* = bnd^{**}$. As
$\langle L, bnd^{**} \rangle \in$ cur_interp$(t-1)(v)$, we know that $bnd' \subseteq bnd^{**}$.
As a result, we have cur_interp' $\sqsubseteq I$, completing the
claim.

Proof of theorem: As initially cur_interp = $\Gamma_P^*$ and $\Gamma_P^* \sqsubseteq I$
by Theorem 3.3, we note that the algorithm changes
cur_interp either by applying $\Gamma_P$ or manipulating it in lines 6-10,
which tells us (by claim 2) that for all models $I$ of $P$ that
CANON_PROC($P$) $\sqsubseteq I$. Since by claim 1 we know that
CANON_PROC($P$) $\models P$, the statement of the theorem follows. □

8.7  Details  on  the  Extracted  Dataset 

One way in which MANCaLog can be used is looking at
proliferation of research on different topics. We look at
research conducted on niacin, an organic compound commonly
used for increasing levels of high density lipoproteins
(HDL). Using Thomson Reuters Web of Knowledge
(http://wokinfo.com) we were able to extract information
on 4,202 articles about niacin. This information was then
processed using the Science of Science (Sci2) Tool
(http://sci2.cns.iu.edu) to extract numerous different networks
such as author by paper networks, citation networks, and paper
by subject networks. Each paper has attributes about
when it was published, what journal it was published in, and
what subjects the paper was about. During the first time
period there is a total of 508 papers with 856 different authors
and 1,231 connections based on an author being cited
as an author of a given paper. During the second time period,
there is a total of 3,790 papers with 10,465 different
authors and 16,772 connections.


